Use of Randomization to Normalize Feature Merits

Authors

  • S. J. Hong
  • J. Hosking
Abstract

Feature merits are used for feature selection in classification and regression as well as for decision tree generation. Commonly used merit functions exhibit a bias towards features that take a large variety of values. We present a scheme based on randomization for neutralizing this bias by normalizing the merits. The merit of a feature is normalized by division by the expected merit of a feature that is random noise taking the same distribution of values as the given feature. The noise feature is obtained by randomly permuting the values of the given feature. The scheme can be used for any merit function including the Gini and entropy measures. We demonstrate its effectiveness by applying it to the contextual merit defined by Hong (IBM Res. Rep. RC19664, 1994).
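
The normalization described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the merit used here is plain entropy-based information gain (the abstract says any merit function may be used), the expected noise merit is estimated by averaging over a fixed number of random permutations, and the names `info_gain`, `normalized_merit`, `n_permutations`, and `seed` are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(feature, labels):
    """Illustrative merit: drop in class entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def normalized_merit(feature, labels, merit=info_gain, n_permutations=200, seed=0):
    """Merit divided by the mean merit of randomly permuted copies of the feature.

    The permuted copies keep the feature's value distribution but carry no
    information about the labels, so they play the role of the noise feature.
    n_permutations and seed are assumptions of this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(feature)
    total = 0.0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        total += merit(shuffled, labels)
    expected_noise = total / n_permutations
    if expected_noise == 0:
        return float("nan")  # ratio is undefined, e.g. for a constant feature
    return merit(feature, labels) / expected_noise


# Toy data: 40 examples with balanced binary labels.
labels = [i % 2 for i in range(40)]
ids = list(range(40))        # unique value per example, no real information
informative = list(labels)   # binary feature that determines the label

raw_ids = info_gain(ids, labels)
raw_informative = info_gain(informative, labels)
norm_ids = normalized_merit(ids, labels)
norm_informative = normalized_merit(informative, labels)
```

In this toy data the raw information gain cannot tell the ID-like feature from the informative one, since both reach the maximum. After normalization the ID-like feature scores no better than its own permutations, while the informative feature scores well above them.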

Similar resources

Optical Character Recognition

Recognition of characters relates to emblematic identity with the image of a character. In the majority of OCR systems, input characters are first converted to digital form by an optical scanner. Every character is first located and segmented, and the resulting character image is fed into a preprocessor for noise reduction and normalization. Certain characteristics are then extracted from the characte...

Randomization in clinical trials: conclusions and recommendations.

The statistical properties of simple (complete) randomization, permuted-block (or simply blocked) randomization, and the urn adaptive biased-coin randomization are summarized. These procedures are contrasted to covariate adaptive procedures such as minimization and to response adaptive procedures such as the play-the-winner rule. General recommendations are offered regarding the use of complete...

On the choice of adequate randomization ranges for limiting the use of unwanted cues in same-different, dual-pair, and oddity tasks.

A major concern when designing a psychophysical experiment is that participants may use a stimulus feature (cue) other than that intended by the experimenter. One way to avoid this problem is to apply random variations to the corresponding feature across stimulus presentations to make the unwanted cue unreliable. An important question facing experimenters who use this randomization (roving) tec...

Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine

Different approaches have been proposed for feature selection to obtain a suitable subset of features among all features. These methods search the feature space for feature subsets which satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...

The role of randomization in clinical trials.

Random assignment of treatments is an essential feature of experimental design in general and clinical trials in particular. It provides broad comparability of treatment groups and validates the use of statistical methods for the analysis of results. Various devices are available for improving the balance of prognostic factors across treatment groups. Several recent initiatives to diminish the ...

Publication date: 1996